Abstract — The purpose of this article is to examine the greedy
adaptive measurement policy in the context of a linear Gaussian
measurement model with an optimization criterion based
on information gain. In the special case of sequential scalar
measurements, we provide sufficient conditions under which the
greedy policy actually is optimal in the sense of maximizing the
net information gain. In the general setting, we also discuss cases
where the greedy policy is not optimal.

Index Terms — entropy, information gain, compressive sensing, 
compressed sensing, greedy policy, optimal policy. 



I. Introduction 

CONSIDER a signal of interest $x$, which is a random
vector taking values in $\mathbb{R}^N$ with (prior) distribution
$\mathcal{N}(\mu, P_0)$ (i.e., Gaussian distribution with mean $\mu$ and $N \times N$
covariance matrix $P_0$). We wish to estimate $x$ based on $M$
measurements of it (where $M$ is specified upfront). The $k$th
measurement ($k \in \{1, \ldots, M\}$) is given by

$$y_k = A_k x + w_k, \tag{1}$$

where $y_k$ takes values in $\mathbb{R}^L$, and the noise $w_k$ has distribution
$\mathcal{N}(0, R)$, independent over $k$. The measurement matrix $A_k$ is
$L \times N$.

Consider the following adaptive (sequential) measurement
problem. For each $k \in \{1, \ldots, M\}$, we are allowed to
choose the measurement matrix $A_k$ from a prespecified set $\mathcal{A}$.
Moreover, our choice is allowed to depend on the entire history
of measurements up to that point: $\mathcal{I}_{k-1} = \{y_1, \ldots, y_{k-1}\}$.

Let the posterior distribution of $x$ given $\mathcal{I}_k$ be $\mathcal{N}(x_k, P_k)$.
More specifically, $P_k$ can be written recursively for $k = 1, \ldots, M$ as

$$P_k = \left(I - P_{k-1}A_k^T(A_kP_{k-1}A_k^T + R)^{-1}A_k\right)P_{k-1}. \tag{2}$$

If this expression seems a little unwieldy, a simpler version is
as follows:

$$P_k = \left(P_{k-1}^{-1} + A_k^TR^{-1}A_k\right)^{-1}, \tag{3}$$

assuming that $P_{k-1}$ and $R$ are nonsingular. Also define the
entropy of the posterior distribution of $x$ given $\mathcal{I}_k$:

$$H_k = \frac{1}{2}\log\det(P_k) + \frac{N}{2}(1 + \log(2\pi)). \tag{4}$$
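
As a numerical illustration of the two covariance recursions and the entropy formula, the following minimal sketch checks that (2) and (3) give the same posterior covariance and that one measurement reduces the entropy (4). The function names, the random test matrices, and the dimensions are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def posterior_cov_gain_form(P_prev, A, R):
    """Covariance update (2): P = (I - P A^T (A P A^T + R)^{-1} A) P."""
    S = A @ P_prev @ A.T + R
    K = P_prev @ A.T @ np.linalg.inv(S)
    return (np.eye(P_prev.shape[0]) - K @ A) @ P_prev

def posterior_cov_information_form(P_prev, A, R):
    """Covariance update (3): P = (P^{-1} + A^T R^{-1} A)^{-1}."""
    return np.linalg.inv(np.linalg.inv(P_prev) + A.T @ np.linalg.inv(R) @ A)

def gaussian_entropy(P):
    """Entropy (4) of an N-dimensional Gaussian with covariance P, in nats."""
    N = P.shape[0]
    _, logdet = np.linalg.slogdet(P)
    return 0.5 * logdet + 0.5 * N * (1.0 + np.log(2.0 * np.pi))

# Hypothetical test data: N = 4 signal dimensions, L = 2 measurement dimensions.
rng = np.random.default_rng(0)
N, L = 4, 2
B = rng.standard_normal((N, N))
P0 = B @ B.T + np.eye(N)          # arbitrary positive definite prior covariance
R = np.diag([0.5, 2.0])           # arbitrary positive definite noise covariance
A1 = rng.standard_normal((L, N))  # arbitrary measurement matrix

P1_gain = posterior_cov_gain_form(P0, A1, R)
P1_info = posterior_cov_information_form(P0, A1, R)
stage_gain = gaussian_entropy(P0) - gaussian_entropy(P1_info)
print(np.allclose(P1_gain, P1_info), stage_gain)
```

Form (3) needs two matrix inversions of size $N$, whereas form (2) inverts only an $L \times L$ matrix, which is why the latter is usually preferred numerically when $L$ is small.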

We focus on a common information-theoretic criterion for
choosing the measurement matrices: For the $k$th decision, we
pick $A_k$ to maximize the per-stage information gain, defined
as $H_{k-1} - H_k$. For reasons that will be made clear later, we
refer to this strategy as the greedy policy. The term policy
simply refers to a rule for picking $A_k$ for each $k$ based on
$\mathcal{I}_{k-1}$.

Suppose that the overall goal is to maximize the net information
gain, defined as $H_0 - H_M$. We ask the following
questions: Does the greedy policy achieve this goal? If not,
then what policy achieves it? How much better is such a policy
than the greedy one? Are there cases where the greedy policy
does achieve this goal? In Section II, we analyze the greedy
policy and compute its net information gain. In Section III,
to find the net information gain of the optimal policy, we
introduce a relaxed optimization problem, which can be solved
as a water-filling problem. In Section IV, we derive two
sufficient conditions under which the greedy policy is optimal.
In Section V, we give examples for which the greedy policy
is not optimal. We also show that the greedy policy can be
arbitrarily worse than the optimal policy.

II. Greedy Policy 

A. Preliminaries 

We now explore how the greedy policy performs in the 
adaptive measurement problem. Before proceeding, we first 
make some remarks on the information gain criterion: 

• Information gain as defined in this paper also goes by
the name mutual information (of $x$ and $y_k$ in the case of
per-stage information gain, and of $x$ and $(y_1, \ldots, y_M)$
in the case of net information gain).

• The net information gain can be written as the cumulative
sum of the per-stage information gains:

$$H_0 - H_M = \sum_{k=1}^{M}(H_{k-1} - H_k).$$

This is why the greedy policy is named as such; at
each stage $k$, the greedy policy simply maximizes the
immediate (short-term) contribution $H_{k-1} - H_k$ to the
overall cumulative sum.

• Using the formulas (4) and (2) for $H_k$ and $P_k$, we can
write

$$H_{k-1} - H_k = -\frac{1}{2}\log\det\left(I - P_{k-1}A_k^T(A_kP_{k-1}A_k^T + R)^{-1}A_k\right). \tag{5}$$

In other words, at the $k$th stage, the greedy policy
minimizes (with respect to $A_k$)

$$\log\det\left(I - P_{k-1}A_k^T(A_kP_{k-1}A_k^T + R)^{-1}A_k\right). \tag{6}$$

• Equivalently, using the other formula (3) for $P_k$, the
greedy policy maximizes

$$\log\det\left(P_{k-1}^{-1} + A_k^TR^{-1}A_k\right) \tag{7}$$

in every stage. For the purpose of optimization, the log
function in the objective functions above can be dropped.

Notice that the greedy policy does not use the values of
$y_1, \ldots, y_{k-1}$; its choice of $A_k$ depends only on $P_{k-1}$ and
$R$. In fact, the formulas above show that information gain is a
deterministic function of the measurement matrices (in our
particular setup). This implies that the optimal policy can be
computed by deterministic dynamic programming. In general,
we would not expect the greedy policy to solve such a dynamic
programming problem. However, as we will see in the following
sections, there are cases where it does.
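
Because the information gain does not depend on the measured values, both the greedy choice and the best sequence over a small finite set can be computed offline. The sketch below is one way to do so under stated assumptions: it uses plain enumeration of all sequences rather than dynamic programming, and the function names and the random candidate set are hypothetical.

```python
import itertools
import numpy as np

def info_update(P_inv, A, R_inv):
    """Inverse-covariance update from (3): P_k^{-1} = P_{k-1}^{-1} + A^T R^{-1} A."""
    return P_inv + A.T @ R_inv @ A

def net_gain(P0, R, sequence):
    """Net information gain H_0 - H_M (nats) for a fixed sequence of matrices."""
    P_inv, R_inv = np.linalg.inv(P0), np.linalg.inv(R)
    for A in sequence:
        P_inv = info_update(P_inv, A, R_inv)
    # H_0 - H_M = 0.5 * (logdet(P_0) + logdet(P_M^{-1}))
    return 0.5 * (np.linalg.slogdet(P0)[1] + np.linalg.slogdet(P_inv)[1])

def greedy_sequence(P0, R, candidates, M):
    """At each stage pick the candidate maximizing criterion (7)."""
    P_inv, R_inv = np.linalg.inv(P0), np.linalg.inv(R)
    chosen = []
    for _ in range(M):
        scores = [np.linalg.slogdet(info_update(P_inv, A, R_inv))[1]
                  for A in candidates]
        best = int(np.argmax(scores))
        chosen.append(best)
        P_inv = info_update(P_inv, candidates[best], R_inv)
    return chosen

def exhaustive_sequence(P0, R, candidates, M):
    """Brute-force search over all len(candidates)**M sequences (small cases only)."""
    return max(itertools.product(range(len(candidates)), repeat=M),
               key=lambda idx: net_gain(P0, R, [candidates[i] for i in idx]))

# Hypothetical instance: four random scalar measurement rows in dimension 3.
rng = np.random.default_rng(1)
P0 = np.diag([4.0, 2.0, 1.0])
R = np.array([[1.0]])
candidates = [rng.standard_normal((1, 3)) for _ in range(4)]
M = 3
greedy_idx = greedy_sequence(P0, R, candidates, M)
best_idx = exhaustive_sequence(P0, R, candidates, M)
greedy_gain = net_gain(P0, R, [candidates[i] for i in greedy_idx])
best_gain = net_gain(P0, R, [candidates[i] for i in best_idx])
print(greedy_idx, greedy_gain, best_idx, best_gain)
```

The enumeration grows exponentially in $M$, so it is only meant as a reference against which the greedy sequence can be compared on toy problems.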



B. Sequential Scalar Measurements 

In this subsection we consider the special case where $L = 1$
(i.e., each measurement is a scalar). In this case, we can
write $A_k = a_k^T$, where $a_k \in \mathbb{R}^N$, and $R = \sigma^2$. This scalar
measurements case is of special interest because it corresponds
to the problem of choosing the rows of an $M \times N$ measurement
matrix $\Phi$ sequentially, one at a time, common in discussions of
adaptive compressive measurements. In this interesting special
scenario,

$$y = \Phi x + w, \tag{8}$$

where $y \in \mathbb{R}^M$ is called the measurements vector, and $w$ is a
white Gaussian noise vector. In this context, the construction
of a "good" measurement matrix $\Phi$ (which would convey more
information about $x$) is also a topic of interest. The concept of
sequential scalar measurements in a closed-loop fashion has
been discussed in a number of recent papers; e.g., |[3], ||6j, 
|j8], pTj , |[T2|. The objective function for the optimization 
here can take a number of possible forms, besides the net 
information gain. For example, in [12], the objective is to
maximize the posterior variance of the expected measurement. 

In this paper we investigate applying the greedy policy in
designing the measurement matrix. If the $a_k$ can only be
chosen from a prescribed finite set, the optimal design of
$\Phi$ is essentially a sensor selection problem (see [13], [16]),
where the greedy policy has been shown to perform well.
For example, in the problem of sensor selection under a
submodular objective function subject to a matroid constraint
p4) , Q, fTT) , the suboptimality of the greedy policy has a 
provable bound. 

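Consider a constraint of the form $\|a_k\| \le 1$ for $k = 1, \ldots, M$
($\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^N$), which is much
more relaxed than a finite prescribed set. The expression in
(6) simplifies to

$$\log\det\left(I - \frac{P_{k-1}a_ka_k^T}{a_k^TP_{k-1}a_k + \sigma^2}\right), \tag{9}$$

which further reduces (see [4, Lemma 1.1]) to

$$\log\left(1 - \frac{a_k^TP_{k-1}a_k}{a_k^TP_{k-1}a_k + \sigma^2}\right). \tag{10}$$

Combining (5), (9), and (10), the information gain in the $k$th
step is

$$H_{k-1} - H_k = -\frac{1}{2}\log\left(1 - \frac{a_k^TP_{k-1}a_k}{a_k^TP_{k-1}a_k + \sigma^2}\right). \tag{11}$$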



Apparently, we should maximize $a_k^TP_{k-1}a_k$ to obtain the
maximal information gain in the $k$th step. We denote the
eigenvalues of $P_0$ by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N > 0$.
Since $P_0$ is a covariance matrix, which is symmetric, there
exist corresponding orthonormal eigenvectors $v_1, v_2, \ldots, v_N$,
respectively. Clearly, we should set $a_1$ equal to $v_1$, which is
the eigenvector of $P_0$ corresponding to its largest eigenvalue
$\lambda_1^0 := \lambda_1$. Then $P_1 = (P_0^{-1} + \sigma^{-2}v_1v_1^T)^{-1}$, and we can
verify the following:

$$v_i = P_1P_1^{-1}v_i = P_1(P_0^{-1} + \sigma^{-2}v_1v_1^T)v_i = \frac{1}{\lambda_i}P_1v_i, \quad \text{for } i \ne 1, \tag{12}$$

$$v_1 = P_1P_1^{-1}v_1 = P_1(P_0^{-1}v_1 + \sigma^{-2}v_1) = \left(\frac{1}{\lambda_1} + \frac{1}{\sigma^2}\right)P_1v_1. \tag{13}$$



So we claim that $P_1$ has the same collection of
eigenvectors as $P_0$, and the eigenvalues of $P_1$ are
$\lambda_1/(1 + \sigma^{-2}\lambda_1), \lambda_2, \ldots, \lambda_N$. By induction, we conclude that,
when applying the greedy policy, all the $P_k$'s for $k = 0, \ldots, M$
have the same collection of eigenvectors, and the
greedy policy always picks vectors from the set of eigenvectors
$\{v_1, \ldots, v_N\}$.

Denote the eigenvalues of $P_k$ by $\lambda_1^k \ge \lambda_2^k \ge \cdots \ge \lambda_N^k$.
After applying $M$ iterations of the greedy policy, the net
information gain is

$$H_0 - H_M = \sum_{k=1}^{M}\max_{a_k}(H_{k-1} - H_k) = -\frac{1}{2}\sum_{k=1}^{M}\log\frac{\sigma^2}{\lambda_1^{k-1} + \sigma^2} = \frac{1}{2}\log\prod_{k=1}^{M}\left(1 + \sigma^{-2}\lambda_1^{k-1}\right). \tag{14}$$
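
The eigenvalue recursion behind (14) can be sketched as follows. The first function works on the eigenvalues alone; the second runs the same policy on the covariance matrix, taking $a_k$ to be a unit eigenvector for the largest eigenvalue of $P_{k-1}$. The function names are assumptions made for illustration, and the test matrix is the $2 \times 2$ prior with eigenvalues 5 and 1 that is used in Section V-B.

```python
import numpy as np

def greedy_scalar_gain(eigvals, sigma2, M):
    """Net gain (14): each step maps the largest eigenvalue lam to
    lam / (1 + lam / sigma2) and contributes 0.5 * log(1 + lam / sigma2)."""
    lam = np.array(eigvals, dtype=float)
    total = 0.0
    for _ in range(M):
        i = int(np.argmax(lam))
        total += 0.5 * np.log1p(lam[i] / sigma2)
        lam[i] = lam[i] / (1.0 + lam[i] / sigma2)
    return total, lam

def greedy_scalar_gain_matrix(P0, sigma2, M):
    """Same policy on the covariance itself: a_k is a top unit eigenvector of P_{k-1}."""
    P = P0.copy()
    for _ in range(M):
        _, V = np.linalg.eigh(P)      # eigenvalues in ascending order
        a = V[:, -1]                  # unit eigenvector of the largest eigenvalue
        P = np.linalg.inv(np.linalg.inv(P) + np.outer(a, a) / sigma2)
    return 0.5 * (np.linalg.slogdet(P0)[1] - np.linalg.slogdet(P)[1])

P0 = np.array([[3.0, 2.0], [2.0, 3.0]])   # eigenvalues 5 and 1
gain_eig, final_eigs = greedy_scalar_gain([5.0, 1.0], sigma2=1.0, M=2)
gain_mat = greedy_scalar_gain_matrix(P0, sigma2=1.0, M=2)
print(gain_eig, gain_mat, final_eigs)
```

For this prior the two steps contribute the factors $1 + 5$ and $1 + 1$, so both routines return $\frac{1}{2}\log 12$.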



III. Relaxed Optimal Policy 

In this section we consider the problem of maximizing the
net information gain

$$H_0 - H_M = -\frac{1}{2}\sum_{k=1}^{M}\log\frac{\det(P_k)}{\det(P_{k-1})} = \frac{1}{2}\log\frac{\det(P_0)}{\det(P_M)} = \frac{1}{2}\log\det(P_0)\det\Big(P_0^{-1} + \sigma^{-2}\sum_{k=1}^{M}a_ka_k^T\Big), \tag{15}$$

subject to $\|a_k\| \le 1$, $k = 1, \ldots, M$. This maximization
problem is actually equivalent to the maximum a
posteriori probability (MAP) problem (see [1] and [17]).

Note that $P_0^{-1}$ is positive semidefinite. Let $G :=
\sigma^{-1}[a_1, a_2, \ldots, a_M]$, which is $N \times M$. The constraint can
be written as $(G^TG)_{ii} \le \sigma^{-2}$ for $i = 1, \ldots, M$. Again by
[4, Lemma 1.1], we obtain

$$\log\det(P_0)\det\Big(P_0^{-1} + \sigma^{-2}\sum_{k=1}^{M}a_ka_k^T\Big) = \log\det(P_0)\det(P_0^{-1} + GG^T) = \log\det(I_M + G^TP_0G), \tag{16}$$

where $I_M$ is the $M \times M$ identity matrix.

Now consider the relaxed constraint $\operatorname{tr}G^TG =
\sigma^{-2}\sum_{k=1}^{M}\|a_k\|^2 \le \sigma^{-2}M$. There exists a unitary matrix
$U := [v_1, \ldots, v_N]^T$, such that $\sigma^{-2}MP_0 = U^T\Lambda U$, where
$\Lambda = \sigma^{-2}M\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$
are the eigenvalues of $P_0$. The notation $\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$
represents the diagonal matrix with entries $\lambda_1, \lambda_2, \ldots, \lambda_N$. As
the matrix $\sigma M^{-1/2}U$ is nonsingular, $G \mapsto \tilde{G} = \sigma M^{-1/2}UG$
maps the set of $N \times M$ matrices one-to-one onto itself.
The constraint $\operatorname{tr}G^TG \le \sigma^{-2}M$ becomes $\operatorname{tr}\tilde{G}^T\tilde{G} \le 1$. We
continue with (16) to write

$$\log\det(I_M + G^TP_0G) = \log\det(I_M + \tilde{G}^T\Lambda\tilde{G}). \tag{17}$$

Hence, the relaxed optimization problem is equivalent to the
following maximization problem:

$$\text{Maximize } \frac{1}{2}\log\det(I_M + \tilde{G}^T\Lambda\tilde{G}), \quad \text{subject to } \operatorname{tr}\tilde{G}^T\tilde{G} \le 1. \tag{18}$$



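Recall the following known results from [19].

Lemma 1: Given $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_M > 0$, there exists a
unique integer $r$, with $1 \le r \le M$, such that for $1 \le k \le r$
we have

$$\frac{1}{\lambda_k} < \frac{1}{k}\Big(1 + \sum_{j=1}^{k}\frac{1}{\lambda_j}\Big), \tag{19}$$

while for indices $k$, if any, satisfying $r < k \le M$ we have

$$\frac{1}{\lambda_k} \ge \frac{1}{k}\Big(1 + \sum_{j=1}^{k}\frac{1}{\lambda_j}\Big). \tag{20}$$

Lemma 2: For $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_M > 0$ and $r$ as in
Lemma 1, the sequence

$$m_k = k^{-1}\Big(1 + \sum_{j=1}^{k}\frac{1}{\lambda_j}\Big), \quad k = r, \ldots, M, \tag{21}$$

is strictly increasing.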



By applying [19, Theorem 2], the optimal value of the
relaxed maximization problem (18) is

$$\frac{1}{2}\log\Big(r^{-r}\Big(\sigma^{-2}M + \sum_{j=1}^{r}\lambda_j^{-1}\Big)^{r}\prod_{j=1}^{r}\lambda_j\Big), \tag{22}$$



Fig. 1. Water-filling solution.



where $r$ is defined by the $M$ biggest eigenvalues of $P_0$,
$\lambda_1, \lambda_2, \ldots, \lambda_M$, as in Lemma 1.

In fact, (22) is obtained from the solution of the well known
water-filling problem (see f7T| for details). It is known that 



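$$r^{-r}\Big(\sigma^{-2}M + \sum_{j=1}^{r}\lambda_j^{-1}\Big)^{r}\prod_{j=1}^{r}\lambda_j$$

is the optimal value of the following maximization problem:

$$\text{Maximize } \prod_{i=1}^{M}(1 + \sigma^{-2}M\lambda_ip_i) \quad \text{subject to } \sum_{i=1}^{M}p_i \le 1. \tag{23}$$

The maximal value is only obtained when

$$p_i = \left(\mu - \sigma^2M^{-1}\lambda_i^{-1}\right)^+, \quad i = 1, 2, \ldots, M, \tag{24}$$

where

$$\mu := r^{-1}\Big(1 + \sum_{j=1}^{r}\sigma^2M^{-1}\lambda_j^{-1}\Big) \tag{25}$$

is called the water level. By taking a close look at (24), we
can see that $p_1 \ge \cdots \ge p_r > 0$ and $p_{r+1} = \cdots = p_M = 0$.
Figure 1 illustrates the relation among $\lambda_i' := \sigma^{-2}M\lambda_i$, $p_i$, and
water level $\mu$.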
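
The water-filling computation can be sketched numerically as below. This is an illustration under stated assumptions rather than the paper's procedure: the function names are hypothetical, nonnegativity of the $p_i$ is imposed explicitly, and $r$ is determined from the scaled values $\sigma^{-2}M\lambda_i$, which is what (24) and (25) require for the allocation to use the whole budget.

```python
import numpy as np

def water_filling(eigvals, sigma2, M):
    """Maximize prod_i (1 + lam'_i p_i) subject to sum_i p_i <= 1, p_i >= 0,
    where lam'_i = M * lam_i / sigma2 uses the M largest eigenvalues."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1][:M]
    floor = sigma2 / (M * lam)                    # 1 / lam'_i, in ascending order
    # Candidate water levels (1 + sum_{j<=k} floor_j) / k for k = 1, 2, ...
    levels = (1.0 + np.cumsum(floor)) / np.arange(1, len(lam) + 1)
    # r is the largest k whose floor lies strictly below its candidate level.
    r = int(np.max(np.nonzero(floor < levels)[0])) + 1
    mu = levels[r - 1]                            # water level (25)
    p = np.maximum(mu - floor, 0.0)               # allocation (24)
    value = 0.5 * np.sum(np.log1p(p / floor))     # 0.5 * log prod (1 + lam'_i p_i)
    return r, mu, p, value

def closed_form_value(eigvals, sigma2, M, r):
    """Expression (22) evaluated with the r largest eigenvalues."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1][:r]
    return 0.5 * (-r * np.log(r)
                  + r * np.log(M / sigma2 + np.sum(1.0 / lam))
                  + np.sum(np.log(lam)))

# Eigenvalues 5 and 1 with unit noise variance and M = 2 measurements.
r, mu, p, value = water_filling([5.0, 1.0], sigma2=1.0, M=2)
check = closed_form_value([5.0, 1.0], sigma2=1.0, M=2, r=r)
print(r, mu, p, value, check)
```

With these inputs both channels are active, the water level is 0.8, the allocation is $(0.7, 0.3)$, and the relaxed value equals $\frac{1}{2}\log 12.8$.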

For the sake of transparency, we make the following remark.
Let $V := \tilde{G}\tilde{G}^T$. Then

$$\det(I_M + \tilde{G}^T\Lambda\tilde{G}) = \det(I_N + \Lambda V). \tag{26}$$

Denote the eigenvalues of $V$ as $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_N \ge 0$.
Since

$$V = \sigma^2M^{-1}UGG^TU^T = M^{-1}\sum_{k=1}^{M}Ua_ka_k^TU^T, \tag{27}$$

at most $M$ of the $\mu_i$ are positive. Moreover, $\sum_{i=1}^{N}\mu_i =
\operatorname{tr}\tilde{G}^T\tilde{G} \le 1$. By [19, Lemma 3],

$$\det(I_N + \Lambda V) \le \prod_{i=1}^{N}(1 + \sigma^{-2}M\lambda_i\mu_i) = \prod_{i=1}^{M}(1 + \lambda_i\mu_i'), \tag{28}$$

where $\mu_i' = \sigma^{-2}M\mu_i$. Equality holds if and only if $\Lambda$
and $V$ commute. Since $\Lambda$ is a diagonal matrix, if $V =
\sigma^2M^{-1}UGG^TU^T$ is also diagonal, they commute.

In practice, if $G = \sigma^{-1}[a_1, a_2, \ldots, a_M]$ is obtained from
a realization of the greedy algorithm, its columns are from the
set of orthonormal eigenvectors. This implies that $\Lambda$ and $V$
commute, and (28) holds with equality. Furthermore, each $\mu_i'$
is a multiple of $\sigma^{-2}$ because it depends on the multiplicity of
appearance of a specific eigenvector $v \in \{v_1, \ldots, v_N\}$ as a
column of $G$.

IV. When Greedy Is Optimal 

In the preceding sections, we have discussed three types
of policies: the greedy policy, the optimal policy, and the
relaxed optimal policy. Denote by $H_G$, $H_O$, and $H_R$ the
net information gains associated with these three policies
respectively. Clearly,

$$H_G \le H_O \le H_R. \tag{29}$$

We now provide two sufficient conditions under which the
greedy policy is optimal (i.e., $H_G = H_O$) for the sequential
scalar measurements problem (8).

As defined before, $\lambda_1 \ge \cdots \ge \lambda_N$ are the eigenvalues of
$P_0$, $M$ is the number of measurements, and $r$ is defined as
in Lemma 1.

Theorem 3: Assume that each row of $\Phi$ can be selected to
be any row vector with $\|a\| \le 1$. If $\lambda_{k+1}^{-1} - \lambda_k^{-1} = n_k\sigma^{-2}$,
where $n_k$ is some nonnegative integer, for $k = 1, \ldots, r$, then
the greedy policy is optimal.

Proof: In the $k$th iteration of the greedy algorithm,
the algorithm changes the largest eigenvalue $\lambda_1^{k-1}$ into
$\lambda_1^{k-1}/(1 + \sigma^{-2}\lambda_1^{k-1})$. This is equivalent to adding $\sigma^{-2}$ to
$1/\lambda_1^{k-1}$. If we consider the whole process of the greedy
algorithm, it simply allocates $M$ blocks of size $\sigma^{-2}$ one by
one to the channel corresponding to the biggest eigenvalue of
$P_{k-1}$.

Since $\lambda_{k+1}^{-1} - \lambda_k^{-1} = n_k\sigma^{-2}$, $k = 1, \ldots, r$, the greedy
solution, which fills in blocks of size $\sigma^{-2}$, gives the same maximal
value of $\prod_{i=1}^{M}(1 + \lambda_i\mu_i')$ as the water-filling solution. Therefore,
$H_G = H_R$. From (29), we conclude that $H_G = H_O$.
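
The block-filling picture used in the proof can be tried on a small instance. In the sketch below, channel $i$ starts at level $1/\lambda_i$, the greedy policy drops blocks of height $\sigma^{-2}$ onto the lowest channel, and the result is compared with pouring the same total volume $M\sigma^{-2}$ continuously. The function names and the instance ($1/\lambda = (1, 2, 4)$, $\sigma^2 = 1$, $M = 3$) are assumptions chosen so that the eigenvalue gaps are integer multiples of $\sigma^{-2}$; the check covers this one instance only.

```python
import numpy as np

def greedy_block_filling(eigvals, sigma2, M):
    """Greedy policy in the block view: each step adds one block of height
    1 / sigma2 to the channel whose current level is lowest."""
    start = 1.0 / np.asarray(eigvals, dtype=float)
    level = start.copy()
    for _ in range(M):
        level[int(np.argmin(level))] += 1.0 / sigma2
    # Each block multiplies the determinant ratio by (new level) / (old level).
    gain = 0.5 * np.sum(np.log(level / start))
    return level, gain

def continuous_water_level(eigvals, sigma2, M):
    """Level reached when the volume M / sigma2 is poured continuously over
    at most M channels; returns (r, level, relaxed gain)."""
    start = np.sort(1.0 / np.asarray(eigvals, dtype=float))[:M]
    for r in range(len(start), 0, -1):
        mu = (M / sigma2 + np.sum(start[:r])) / r
        if start[r - 1] < mu:                 # channel r is actually wetted
            gain = 0.5 * np.sum(np.log(mu / start[:r]))
            return r, mu, gain

eigvals = [1.0, 0.5, 0.25]                    # 1 / lambda = (1, 2, 4)
levels, greedy_gain = greedy_block_filling(eigvals, sigma2=1.0, M=3)
r, mu, relaxed_gain = continuous_water_level(eigvals, sigma2=1.0, M=3)
print(levels, greedy_gain, r, mu, relaxed_gain)
```

Here the three blocks bring the first two channels to the common level 3, which is also the continuous water level, so the greedy gain and the relaxed value coincide at $\frac{1}{2}\log 4.5$.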



The next result provides an alternative sufficient condition 
for greedy to be optimal, based on eigenvectors of Pq. 



Fig. 2. Obtain allocation $\eta$ from $\gamma$.



Theorem 4: Suppose that the rows of $\Phi$ can be picked from
a set $S \subseteq \{v_1^T, \ldots, v_N^T\}$, which is a subset of the orthonormal
eigenvectors of $P_0$. If $\{v_1^T, \ldots, v_r^T\} \subseteq S$, then the greedy
policy is optimal.

Proof: If we pick rows of $\Phi$ from $\{v_1^T, \ldots, v_N^T\}$, $V$ is a
diagonal matrix as we analyzed in (27). Then $\Lambda$ and $V$ commute
and the equality in (28) holds. Hence, the net information
gain is $\frac{1}{2}\log\prod_{k=1}^{M}(1 + \lambda_k\mu_k')$. We have claimed that each $\mu_k'$ is a
multiple of $\sigma^{-2}$ and $\sum_{k=1}^{M}\mu_k' = \sigma^{-2}M$. The optimal solution
is simply the best allocation of $M$ blocks which maximizes
$\prod_{k=1}^{M}(1 + \lambda_k\mu_k')$. Assume that $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_M)$ gives an
optimal solution. If $(\gamma_i + \lambda_i^{-1} - \mu) \ge \sigma^{-2}$ for some $1 \le i \le M$,
where $\mu$ is the water level defined in (25), this means there
exists some channel $1 \le j \le M$, such that $(\gamma_j + \lambda_j^{-1} - \mu) < 0$.
Now move the top block of the $i$th channel to the $j$th channel
to get another allocation $\eta = (\eta_1, \ldots, \eta_M)$. Clearly, $\eta$ and
$\gamma$ have the same entries except the $i$th and $j$th ones. The
argument in this paragraph is illustrated in Figure 2.
Write $\gamma_k + \lambda_k^{-1} = \mu + \delta_k$ for $k = 1, \ldots, M$. So

$$\frac{\prod_{k=1}^{M}(1 + \lambda_k\eta_k)}{\prod_{k=1}^{M}(1 + \lambda_k\gamma_k)} = \frac{\left(1 + \lambda_i(\mu + \delta_i - \lambda_i^{-1} - \sigma^{-2})\right)\left(1 + \lambda_j(\mu + \delta_j - \lambda_j^{-1} + \sigma^{-2})\right)}{\left(1 + \lambda_i(\mu + \delta_i - \lambda_i^{-1})\right)\left(1 + \lambda_j(\mu + \delta_j - \lambda_j^{-1})\right)} \tag{30}$$

$$= \frac{(\mu + \delta_i - \sigma^{-2})(\mu + \delta_j + \sigma^{-2})}{(\mu + \delta_i)(\mu + \delta_j)} > 1, \tag{31}$$

since $\delta_i - \delta_j > \sigma^{-2}$. Thus $\eta$ gives a better allocation and
a contradiction to the optimality of $\gamma$. By similar arguments,
we obtain that for optimal solution $\gamma$, there also does not
exist $i$ such that $(\gamma_i + \lambda_i^{-1} - \mu) \le -\sigma^{-2}$. In conclusion, in
the optimal solution the water level in each channel deviates
from $\mu$ less than $\sigma^{-2}$. This also means in an optimal allocation
$\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_M)$, $\gamma_{r+1} = \cdots = \gamma_M = 0$. Since we only
need to fill blocks to the first $r$ channels to obtain an optimal
allocation, only $v_1^T, \ldots, v_r^T$ are needed in $S$.

It turns out that we can show an identical property for
the greedy policy. Assume that $\eta = (\eta_1, \eta_2, \ldots, \eta_M)$ gives
a greedy solution. If $\eta_i > 0$ and $(\eta_i + \lambda_i^{-1} - \mu) \ge \sigma^{-2}$
for some $1 \le i \le M$, this implies that there exists a channel
$1 \le j \le M$, such that $(\eta_j + \lambda_j^{-1} - \mu) < 0$. Therefore, when the
greedy algorithm fills the last block to channel $i$, it does not
add that block to a channel whose level is lower. This gives a
contradiction. By similar arguments, there does not exist some
channel $i$ such that $(\eta_i + \lambda_i^{-1} - \mu) \le -\sigma^{-2}$. This implies that
in the greedy solution the water level in each channel deviates
from $\mu$ less than $\sigma^{-2}$. Moreover, the greedy algorithm will
never fill blocks to channels other than the first $r$ of them.
This implies that only $v_1^T, \ldots, v_r^T$ are needed in $S$.

Consequently, both the greedy algorithm and the optimal
allocation meet the following stage in the allocating process:
if we add one block to any channel $i$ for $1 \le i \le r$, the
water level in that channel will be above $\mu$. Assume that at
this stage we have already allocated $M'$ blocks and there are
still $r'$ more blocks to be allocated. It is easy to check with the
argument of the total volume ($M\sigma^{-2}$) that $r' := M - M' < r$.
To maximize (28), we can verify that the optimal solution is
simply adding those $r'$ blocks to the channels with the $r'$ lowest
water levels respectively. Otherwise, using the above arguments we
can always find a better allocation. On the other hand, the
greedy algorithm will do the same thing with those $r'$ blocks.
So the greedy policy is exactly the optimal policy. ■

V. When Greedy Is Not Optimal 

A. An Example with Non-Scalar Measurements 

In this subsection we give an example where the greedy
policy is not optimal. Indeed, as we will see, the greedy policy
can be arbitrarily worse than the optimal policy. Suppose that
we are restricted to a set of only three choices of $A_k$:

$$\mathcal{A} = \left\{\operatorname{diag}(1, 0),\ \operatorname{diag}(0, 1),\ \tfrac{1}{2}\operatorname{diag}(1, 1)\right\}.$$

Note that $\operatorname{diag}(1, 1) = I$. In this case, $L = N = 2$. Moreover,
set $M = 2$, $P_0 = 16I$, and $R = I$.

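Let us see what the greedy policy would do in this case.
For $k = 1$, it would pick $A_1 \in \mathcal{A}$ to maximize

$$\det\left(\tfrac{1}{16}I + A_1^TA_1\right).$$

A quick calculation shows that for $A_1 = \operatorname{diag}(1, 0)$ or
$\operatorname{diag}(0, 1)$, we have

$$\det\left(\tfrac{1}{16}I + A_1^TA_1\right) = \frac{17}{256},$$

whereas for $A_1 = \frac{1}{2}\operatorname{diag}(1, 1)$,

$$\det\left(\tfrac{1}{16}I + A_1^TA_1\right) = \frac{25}{256}.$$

So the greedy policy picks $A_1 = \frac{1}{2}\operatorname{diag}(1, 1)$, which leads to
$P_1 = \frac{16}{5}I$.

For $k = 2$, we go through the same calculations: for $A_2 =
\operatorname{diag}(1, 0)$ or $\operatorname{diag}(0, 1)$, we have

$$\det\left(\tfrac{5}{16}I + A_2^TA_2\right) = \frac{105}{256},$$

whereas for $A_2 = \frac{1}{2}\operatorname{diag}(1, 1)$,

$$\det\left(\tfrac{5}{16}I + A_2^TA_2\right) = \frac{81}{256}.$$

So, this time the greedy policy picks $A_2 = \operatorname{diag}(1, 0)$ (or
$\operatorname{diag}(0, 1)$), after which $\det(P_2) = 256/105$.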

Consider the alternative policy that picks $A_1 = \operatorname{diag}(1, 0)$
and $A_2 = \operatorname{diag}(0, 1)$. In this case,

$$P_2^{-1} = \frac{1}{16}I + \operatorname{diag}(1, 0) + \operatorname{diag}(0, 1) = \frac{17}{16}I, \tag{32}$$

and so $\det(P_2) = 256/289$, which is clearly less than what
was obtained with the greedy policy. Call this alternative
policy the alternating policy (because it alternates between
$\operatorname{diag}(1, 0)$ and $\operatorname{diag}(0, 1)$).

Conclusion: For this example, the greedy policy is not
optimal with respect to the objective of maximizing the net
information gain.

How much worse is the objective function of the greedy
policy relative to that of the alternating policy? Suppose that
we set $P_0 = a^{-1}I$ and let the third choice in $\mathcal{A}$ be $a^{1/4}I$,
where $a > 0$ is some small number. (Note that the numerical
example above is a special case with $a = 1/16$.) In this case, it
is straightforward to check that the greedy policy picks $A_1 =
a^{1/4}I$ and $A_2 = \operatorname{diag}(1, 0)$ (or $\operatorname{diag}(0, 1)$) if $a$ is sufficiently
small, resulting in

$$\det(P_2) = \frac{1}{(a + \sqrt{a})(1 + a + \sqrt{a})},$$

which increases unboundedly as $a \to 0$. However, the alternating
policy results in

$$\det(P_2) = \frac{1}{(1 + a)^2},$$

which converges to 1 as $a \to 0$. Hence, letting $a$ get arbitrarily
small, the ratio of the objective function for the greedy policy
to that of the alternating policy can be made arbitrarily large.
This means that the greedy policy is arbitrarily worse than the
alternating policy.
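
The comparison above can be reproduced numerically. The following sketch, with hypothetical function names, runs two greedy stages on the three-element set with $P_0 = a^{-1}I$ and $R = I$, and compares the resulting $\det(P_2)$ with that of the alternating policy for several values of $a$.

```python
import numpy as np

def det_P2(a, first, second):
    """det(P_2) for P_0 = I / a and R = I after measuring with first, then second."""
    P2_inv = a * np.eye(2) + first.T @ first + second.T @ second
    return 1.0 / np.linalg.det(P2_inv)

def greedy_pair(a, candidates):
    """Two greedy stages; each maximizes det(P_{k-1}^{-1} + A^T A), i.e. (7) with R = I."""
    P_inv, picks = a * np.eye(2), []
    for _ in range(2):
        dets = [np.linalg.det(P_inv + A.T @ A) for A in candidates]
        best = candidates[int(np.argmax(dets))]
        picks.append(best)
        P_inv = P_inv + best.T @ best
    return picks

def compare(a):
    """Return (greedy det(P_2), alternating det(P_2)) for a given a > 0."""
    e1, e2 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    candidates = [e1, e2, a ** 0.25 * np.eye(2)]
    greedy = det_P2(a, *greedy_pair(a, candidates))
    alternating = det_P2(a, e1, e2)
    return greedy, alternating

for a in (1 / 16, 1e-2, 1e-4, 1e-6):
    g, alt = compare(a)
    print(f"a={a:g}  greedy={g:.6g}  alternating={alt:.6g}  ratio={g / alt:.6g}")
```

A smaller $\det(P_2)$ means a larger net information gain, so a growing ratio in the last column corresponds to the greedy policy falling further behind.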

What went wrong? The greedy policy was "fooled" into 
picking $A_1 = a^{1/4}I$ in the first stage, because this choice
maximizes the per-stage information gain in the first stage. But 
once it does that, it is stuck with its resulting covariance matrix 
Pi. The alternating policy trades off the per-stage information 
gain in the first stage for the sake of better net information gain 
over two stages. The first measurement matrix diag(l, 0) "sets 
up" the covariance matrix Pi so that the second measurement 
matrix diag(0, 1) can take advantage of it to obtain a superior 
covariance P2 after the second stage, embodying a form of 
"delayed gratification." 

Interestingly, the argument above depends on the value of 
a being sufficiently small. For example, if a = 1/4, then 
the greedy policy has the same net information gain as the 
alternating policy, and is in fact optimal. 
B. An Example with Scalar Measurements

Assume that

$$P_0 = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix},$$

$R = I$, and set $M = 2$. Our goal is to find $\|a\| \le 1$, $\|b\| \le 1$ such
that $a$, $b$ maximize the net information gain:

$$H_0 - H_2 = \frac{1}{2}\log\det(P_0)\det(P_0^{-1} + aa^T + bb^T). \tag{33}$$

By simple computation, we know that the eigenvalues of $P_0$
are $\lambda_1 = 5$ and $\lambda_2 = 1$. If we follow the greedy policy, the
eigenvalues of $P_1$ are $\lambda_1^1 = 1$ and $\lambda_2^1 = 5/6$. By (14), the net
information gain for the greedy policy is

$$H_0 - H_2 = \frac{1}{2}\log(1 + 5)(1 + 1) = \frac{1}{2}\log(12).$$

Next we solve for the optimal solution. Let $a = [a_1, a_2]^T$.
By (3), we have

$$P_1 = \frac{1}{3a_1^2 + 4a_1a_2 + 3a_2^2 + 1}\begin{bmatrix} 5a_2^2 + 3 & -(5a_1a_2 - 2) \\ -(5a_1a_2 - 2) & 5a_1^2 + 3 \end{bmatrix}.$$

We compute that

$$\lambda_1^1 = \frac{(25a_1^4 + 50a_1^2a_2^2 - 80a_1a_2 + 25a_2^4 + 16)^{1/2} + 5a_1^2 + 5a_2^2 + 6}{6a_1^2 + 8a_1a_2 + 6a_2^2 + 2}. \tag{34}$$

When we choose $b$ in the second stage, we can simply
maximize the information gain in that stage. In this special
case when $M = 2$, the second stage is actually the last
one. If $a$ is given, maximizing the net information gain is
equivalent to maximizing the information gain in the second
stage. Therefore, the second step is equivalent to a greedy step.
By (14),

$$H_1 - H_2 = \frac{1}{2}\log(1 + \lambda_1^1). \tag{35}$$

By (11), we know

$$H_0 - H_1 = -\frac{1}{2}\log\det\left(I - \frac{P_0aa^T}{a^TP_0a + 1}\right) = \frac{1}{2}\log(4 + 4a_1a_2). \tag{36}$$

Using $\|a\| = 1$, we simplify the sum of (35) and (36) and
obtain

$$\frac{1}{2}\log\left(\frac{1}{2}\left((41 - 80a_1a_2)^{1/2} + 8a_1a_2 + 19\right)\right). \tag{37}$$

This expression reaches its maximal value when $a_1a_2 = 1/5$.
So the optimal net information gain is $\frac{1}{2}\log(12.8)$, when, for instance,

$$a = \left[\left(\frac{\sqrt{21} + 5}{10}\right)^{1/2}, \left(\frac{-\sqrt{21} + 5}{10}\right)^{1/2}\right]^T \quad \text{and} \quad b = \left[\left(\frac{-\sqrt{21} + 5}{10}\right)^{1/2}, \left(\frac{\sqrt{21} + 5}{10}\right)^{1/2}\right]^T.$$

This implies that the greedy policy is not optimal.
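
The optimal value in this example can be cross-checked by a direct search. The sketch below is an illustration under stated assumptions: it parameterizes a unit-norm first measurement as $a = (\cos\theta, \sin\theta)$, uses the best second measurement for the resulting $P_1$ (a greedy last step), and scans $\theta$ on a grid. The function name and the grid resolution are choices made for illustration.

```python
import numpy as np

P0 = np.array([[3.0, 2.0], [2.0, 3.0]])   # eigenvalues 5 and 1; noise variance 1

def two_stage_gain(theta):
    """Net gain for a = (cos theta, sin theta) followed by the best unit vector b."""
    a = np.array([np.cos(theta), np.sin(theta)])
    first = 0.5 * np.log1p(a @ P0 @ a)                       # H_0 - H_1, from (11)
    P1 = np.linalg.inv(np.linalg.inv(P0) + np.outer(a, a))   # update (3)
    second = 0.5 * np.log1p(np.linalg.eigvalsh(P1)[-1])      # H_1 - H_2, b = top eigenvector
    return first + second

thetas = np.linspace(0.0, np.pi, 4001)
gains = np.array([two_stage_gain(t) for t in thetas])
best_theta = thetas[int(np.argmax(gains))]
best_gain = gains.max()
greedy_gain = two_stage_gain(np.pi / 4)   # a = (1, 1) / sqrt(2), the greedy first step
print(best_gain, greedy_gain, np.sin(best_theta) * np.cos(best_theta))
```

The grid maximum is close to $\frac{1}{2}\log 12.8$ and is attained near $a_1a_2 = 1/5$, whereas the greedy first step $\theta = \pi/4$ gives $\frac{1}{2}\log 12$.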